Comments for MEDB 5502, Week 06

Topics to be covered

  • What you will learn
    • Test of two proportions
    • Chi-square test of independence
    • Odds ratio versus relative risk
    • Concepts behind the logistic regression model
    • Logistic regression with categorical variables
    • Logistic regression with interactions

Comparing two binary outcomes

  • Is there a difference in the proportion of deaths between male passengers and female passengers on the Titanic?
  • Is there a difference in the proportion of patients finishing the full three doses of HPV vaccine between Black women and White women?
  • Does using an NG tube for feeding in pre-term infants increase the probability of successful breast feeding at six months?

Other comparisons involving a binary outcome

  • Is there a difference in the proportion of deaths between first class, second class, and third class passengers?
  • Does age influence the proportion of women finishing the full three doses of HPV vaccine?
  • Controlling for the mother’s age, does using an NG tube for feeding in pre-term infants increase the probability of successful breast feeding at six months?

Hypothesis framework

  • \(H_0:\ \pi_1=\pi_2\)
  • \(H_1:\ \pi_1 \ne \pi_2\)
  • Compute \(\hat p_1\) and \(\hat p_2\) from samples
  • Accept \(H_0\) if \(\hat p_1-\hat p_2\) is close to zero.
    • \(T=(\hat p_1-\hat p_2)/s.e.\)
    • 95% CI: \((\hat p_1-\hat p_2) \pm Z_{\alpha/2}s.e.\)
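
A minimal Python sketch of this framework, using the Titanic counts by sex that appear later in these notes (709 deaths among 851 men, 154 deaths among 462 women). The function name `two_proportion_test` is hypothetical. The slides do not say which s.e. is intended, so the pooled standard error for the test and the unpooled standard error for the interval are both assumptions.

```python
import math

def two_proportion_test(x1, n1, x2, n2, z_crit=1.96):
    """Return (difference, test statistic, CI lower, CI upper)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Assumption: pooled standard error under H0 (pi_1 = pi_2) for the test.
    p_pool = (x1 + x2) / (n1 + n2)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    t = diff / se_pool
    # Assumption: unpooled standard error for the confidence interval.
    se_ci = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, t, diff - z_crit * se_ci, diff + z_crit * se_ci

# Deaths and totals: males first, females second.
diff, t, lo, hi = two_proportion_test(709, 851, 154, 462)
print(f"difference = {diff:.4f}, T = {t:.2f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

A test statistic far from zero, together with an interval that excludes zero, is evidence against \(H_0\).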

Data layout, 1 of 2

Data layout, 2 of 2

Confidence interval and test of hypothesis

Live demo, Test of two proportions

Break #1

  • What you have learned
    • Test of two proportions
  • What’s coming next
    • Chi-square test of independence

Chi-square test of independence, 1 of 2

  • Equivalent to test of two proportions
  • Lay out data in two by two table
\[\begin{matrix} & No\ event & Event \\ Treatment & O_{11} & O_{12}\\ Control & O_{21} & O_{22} \end{matrix}\]

Chi-square test of independence, 2 of 2

\[\begin{matrix} & No\ event & Event \\ Treatment & E_{11} = n_1 (1-\hat p_.) & E_{12}=n_1 \hat p_.\\ Control & E_{21} = n_2 (1-\hat p_.) & E_{22}=n_2 \hat p_. \end{matrix}\]
  • \(X^2=\sum_{i,j} \frac{(O_{ij}-E_{ij})^2}{E_{ij}}\)
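
The observed and expected counts can be turned into the test statistic with a few lines of Python. This is a sketch rather than a reference implementation: the function name `pearson_chi_square` is hypothetical, and the counts are the Titanic counts by sex from the table later in these notes.

```python
def pearson_chi_square(table):
    """Pearson X^2 for a list-of-rows contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column proportion.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2

# Rows: female, male. Columns: survived (no event), died (event).
titanic = [[308, 154], [142, 709]]
x2 = pearson_chi_square(titanic)
print(f"X^2 = {x2:.1f} on 1 degree of freedom")
```

For a two by two table, this statistic equals the square of the two-proportion test statistic computed with a pooled standard error, which is why the two tests are equivalent.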

Example: Titanic survival by sex

  • Moderate or large sample size: Pearson Chi-Square
  • Small sample size: Fisher’s Exact test

Live demo, Chi-square test of independence

Break #2

  • What you have learned
    • Chi-square test of independence
  • What’s coming next
    • Odds ratio versus relative risk

Titanic data

       Survived   Died  Total
Female   308      154     462
Male     142      709     851
Total    450      863   1,313

Titanic data, odds of death

       Survived   Died  Total  Odds
Female   308      154     462  2     to 1 against
Male     142      709     851  4.993 to 1 in favor
Total    450      863   1,313

Odds ratio = 4.993 / 0.5 = 9.986

Titanic data, probability of death

       Survived   Died  Total  Probability
Female   308      154     462    0.3333
Male     142      709     851    0.8331
Total    450      863   1,313

Relative risk = 0.8331 / 0.3333 = 2.5
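
The odds ratio and the relative risk can both be reproduced from the four cell counts in the Titanic table. A short sketch, with variable names chosen only for illustration:

```python
# Deaths and survivors by sex, taken from the Titanic table.
died = {"female": 154, "male": 709}
survived = {"female": 308, "male": 142}

# Odds of death = deaths / survivors; probability = deaths / total.
odds = {sex: died[sex] / survived[sex] for sex in died}
prob = {sex: died[sex] / (died[sex] + survived[sex]) for sex in died}

odds_ratio = odds["male"] / odds["female"]
relative_risk = prob["male"] / prob["female"]

print(f"odds of death: female {odds['female']:.3f}, male {odds['male']:.3f}")
print(f"odds ratio = {odds_ratio:.3f}, relative risk = {relative_risk:.3f}")
```

Working from unrounded counts gives values that agree with the hand calculations up to rounding.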

Which is better

  • Relative risk is consistent with how most people think, but
    • Relative risk cannot always be computed
    • Relative risk has an ambiguity

Fractions are funny

  ----------  ----------
  0.8  (4/5)  1.25 (5/4)  
  0.75 (3/4)  1.33 (4/3)  
  0.67 (2/3)  1.50 (3/2)  
  0.50 (1/2)  2.00 (2/1)  
  ----------  ----------

Swapping the numerator and denominator

  • Odds ratio = male odds / female odds
    • = 4.993 / 0.5 = 9.986
  • Odds ratio = female odds / male odds
    • = 0.5 / 4.993 = 0.1001
  • Relative risk = male probability / female probability
    • = 0.8331 / 0.3333 = 2.4996
  • Relative risk = female probability / male probability
    • = 0.3333 / 0.8331 = 0.4001

Interpretability, 1 of 3

  • Change from 25% probability to 50% probability
  • Change from 3 to 1 odds against to even odds
    • RR = 2, OR = 3

Interpretability, 2 of 3

  • Change from 25% probability to 75% probability
  • Change from 3 to 1 odds against to 3 to 1 odds in favor
    • RR = 3, OR = 9

Interpretability, 3 of 3

  • Change from 10% probability to 90% probability
  • Change from 9 to 1 odds against to 9 to 1 odds in favor
    • RR = 9, OR = 81
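
The three comparisons on these slides can be checked with a short script. The helper `odds` is a hypothetical name; it converts a probability to odds in favor.

```python
def odds(p):
    """Convert a probability to odds in favor."""
    return p / (1 - p)

# (baseline probability, new probability) for the three slides.
changes = [(0.25, 0.50), (0.25, 0.75), (0.10, 0.90)]

for p0, p1 in changes:
    rr = p1 / p0
    odds_ratio = odds(p1) / odds(p0)
    print(f"{p0:.0%} -> {p1:.0%}: RR = {rr:.0f}, OR = {odds_ratio:.0f}")
```

The odds ratio is further from 1 than the relative risk in every case, and the gap widens as the probabilities get larger. When both probabilities are small, the two measures are close.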

Designs that rule out the use of the relative risk, 1 of 2

         Cancer cases  Controls  Total  
Balding       72          82      154  
 Hairy        55          57      112  
 Total       129         139      268  

Designs that rule out the use of the relative risk, 2 of 2

         Heart disease     Healthy     Total  
Balding    127 (9.4%)   1,224 (90.6%)  1,351  
 Hairy     548 (6.7%)   7,611 (93.3%)  8,159  
 Total     675          8,835          9,510  

Covariate adjustments

          Children   No children  Total  
Epilepsy  232 (40%)   354 (60%)    586  
Control    79 (72%)    30 (28%)    109  
Total     311         384          695  

Ambiguous and confusing situations

  • One hundred pound sack of potatoes
    • 99% water, 1% potato
    • Weighs 1 pound after completely drying
    • Instead dry until 2% potato
      • How much does it weigh then?
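
The arithmetic behind the puzzle, written out as a short sketch. The only fact used is that drying removes water and leaves the potato solids unchanged.

```python
total_weight = 100.0      # pounds, before drying
potato_fraction = 0.01    # 1% potato, 99% water
dry_weight = total_weight * potato_fraction  # 1 pound of potato solids

# Drying removes only water, so the potato solids stay at 1 pound.
# At 2% potato, the solids are 2% of the new total weight.
new_fraction = 0.02
new_weight = dry_weight / new_fraction
print(f"Weight at 2% potato: {new_weight:.0f} pounds")
```

The water share barely changes (99% to 98%), but the potato share doubles (1% to 2%), and the weight is cut in half. Which of the two complementary percentages you take the ratio of changes the story completely.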

Example: physician recommendations

                 No cath      Cath      Total  
 Male patient   34  (9.4%)  326 (90.6%)  360  
Female patient  55 (15.3%)  305 (84.7%)  360  
         Total  89          631          720  
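
A sketch that computes the relative risk both ways from the counts in this table, alongside the odds ratio. The helper `prob` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
# Counts from the table: (no cath, cath) for each patient group.
male = {"no_cath": 34, "cath": 326}
female = {"no_cath": 55, "cath": 305}

def prob(group, outcome):
    """Proportion of the group with the given outcome."""
    return group[outcome] / (group["no_cath"] + group["cath"])

# Relative risk depends on which outcome is treated as the event.
rr_cath = prob(female, "cath") / prob(male, "cath")
rr_no_cath = prob(female, "no_cath") / prob(male, "no_cath")

# The odds ratio for one outcome is the reciprocal of the other.
or_cath = (female["cath"] / female["no_cath"]) / (male["cath"] / male["no_cath"])
or_no_cath = 1 / or_cath

print(f"RR (cath) = {rr_cath:.2f}, RR (no cath) = {rr_no_cath:.2f}")
print(f"OR (cath) = {or_cath:.2f}, OR (no cath) = {or_no_cath:.2f}")
```

The two relative risks are not reciprocals of each other, so the size of the effect appears to depend on which column is labeled as the event. The odds ratio does not have this ambiguity.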

Example: Breast feeding study

           Continued bf  Stopped bf  Total  
Treatment   19 (37.3%)   32 (62.7%)    51  
 Control     5  (8.8%)   52 (91.2%)    57  
  Total     24           84           108  

Break #3

  • What you have learned
    • Odds ratio versus relative risk
  • What’s coming next
    • Concepts behind the logistic regression model

What is logistic regression?

  • Binary outcome
  • Categorical or continuous predictors
  • Linear on the log odds scale

Why log odds?

  • Statistical model of surgery
    • Estimates probability of demise
    • First prediction: probability=1.2
  • Log odds prevent out of range predictions

A linear model for probability, 1 of 2

A linear model for probability, 2 of 2

A multiplicative model for probability

The relationship between odds and probability

  • odds = prob / (1-prob)
  • prob = odds / (1+odds)
    • \(0 \le\) prob \(\le 1\)
    • \(0 \le\) odds \(< \infty\)
      • \(0 \le\) odds against \(< 1\)
      • \(1 <\) odds in favor \(< \infty\)
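
The two conversion formulas, sketched as a pair of Python functions. The names `prob_to_odds` and `odds_to_prob` are hypothetical.

```python
def prob_to_odds(prob):
    """odds = prob / (1 - prob); requires 0 <= prob < 1."""
    return prob / (1 - prob)

def odds_to_prob(odds):
    """prob = odds / (1 + odds); requires odds >= 0."""
    return odds / (1 + odds)

for prob in (0.1, 0.25, 0.5, 0.75, 0.9):
    odds = prob_to_odds(prob)
    label = "against" if odds < 1 else "in favor" if odds > 1 else "even"
    back = odds_to_prob(odds)
    print(f"prob = {prob:.2f}  odds = {odds:.3f} ({label})  back to prob = {back:.2f}")
```

Converting to odds and back returns the original probability, so the two scales carry the same information.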

A log odds model for probability, 1 of 4

A log odds model for probability, 2 of 4

A log odds model for probability, 3 of 4

A log odds model for probability, 4 of 4

An example of a log odds model with real data, 1 of 3

An example of a log odds model with real data, 2 of 3

  • log odds = -16.72 + 0.577 \(\times\) GA

An example of a log odds model with real data, 3 of 3

  • log odds = -16.72 + 0.577 \(\times\) 30 = 0.59
  • odds = exp(log odds) = 1.8
  • prob = odds / (1+odds) = 0.64
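
The three steps (log odds, then odds, then probability) can be wrapped in one function. This is a sketch using the coefficients shown on the slide. It assumes GA stands for gestational age in weeks, which the slide does not spell out, and the function name `predicted_probability` is hypothetical.

```python
import math

# Coefficients from the fitted model on the slide.
# Assumption: GA is gestational age in weeks.
INTERCEPT = -16.72
SLOPE = 0.577

def predicted_probability(ga):
    log_odds = INTERCEPT + SLOPE * ga
    odds = math.exp(log_odds)
    return odds / (1 + odds)

for ga in (26, 28, 30, 32, 34):
    print(f"GA = {ga}: predicted probability = {predicted_probability(ga):.2f}")
```

Because the odds are always positive, the predicted probability stays strictly between 0 and 1 for any value of GA in this range.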

Live demo, Concepts behind the logistic regression model

Break #4

  • What you have learned
    • Concepts behind the logistic regression model
  • What’s coming next
    • Logistic regression with categorical variables

Categorical variables in a logistic regression model, 1 of 3

  • 1st class odds: 129/193 = 0.67 or 193/129 = 1.5
  • 2nd class odds: 161/119 = 1.35 or 119/161 = 0.74
  • 3rd class odds: 573/138 = 4.15 or 138/573 = 0.24

Categorical variables in a logistic regression model, 2 of 3

  • 1.50 / 0.24 = 6.212
  • 0.74 / 0.24 = 3.069

Categorical variables in a logistic regression model, 3 of 3

  • 0.74 / 1.50 = 0.494
  • 0.24 / 1.50 = 0.161
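
The odds ratios on these three slides depend on which passenger class is the reference category. A sketch that reproduces them from the counts, assuming that the first number in each pair on the first slide is deaths and the second is survivors (the sums match the 863 deaths and 450 survivors in the Titanic table). The function name `odds_ratios` is hypothetical.

```python
# Assumption: in each pair on the slide, the first count is deaths and the
# second is survivors (the sums match the 863 deaths and 450 survivors).
died = {"1st": 129, "2nd": 161, "3rd": 573}
survived = {"1st": 193, "2nd": 119, "3rd": 138}

survival_odds = {c: survived[c] / died[c] for c in died}

def odds_ratios(reference):
    """Odds ratio of survival for each class versus the reference class."""
    return {c: survival_odds[c] / survival_odds[reference]
            for c in survival_odds if c != reference}

vs_third = odds_ratios("3rd")
vs_first = odds_ratios("1st")
print("reference = 3rd class:", {c: round(v, 3) for c, v in vs_third.items()})
print("reference = 1st class:", {c: round(v, 3) for c, v in vs_first.items()})
```

Changing the reference category changes every odds ratio, but the underlying odds for each class are the same.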

Live demo, Logistic regression with categorical variables

Break #5

  • What you have learned
    • Logistic regression with categorical variables
  • What’s coming next
    • Logistic regression with interactions

Interactions in logistic regression

  • Odds ratios vary by a third factor
  • Interpretation is more tedious

Odds ratios for first class

Odds ratio for second class

Odds ratio for third class

Logistic regression with interaction

  • Odds ratio for 3rd class = 4.608
  • Odds ratio for 1st class = 4.608 \(\times\) 6.572 = 30.3
  • Odds ratio for 2nd class = 4.608 \(\times\) 9.289 = 42.8
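
Multiplying odds ratios is the same as adding coefficients on the log odds scale. A sketch of that step, assuming that 4.608 is the odds ratio in the reference class (3rd) and that 6.572 and 9.289 are the exponentiated interaction terms for 1st and 2nd class, as the slide suggests.

```python
import math

# Values from the slide. Assumption: 4.608 is the odds ratio in the reference
# class (3rd), and 6.572 and 9.289 are the interaction multipliers.
or_third = 4.608
interaction = {"1st": 6.572, "2nd": 9.289}

# On the log odds scale the interaction term is added to the main effect.
combined = {cls: math.exp(math.log(or_third) + math.log(multiplier))
            for cls, multiplier in interaction.items()}

for cls, value in combined.items():
    print(f"Odds ratio for {cls} class = {value:.1f}")
```

Without the interaction, a single odds ratio would apply to all three classes. With it, each class gets its own odds ratio.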

Live demo, Logistic regression with interactions

Summary

  • What you have learned
    • Test of two proportions
    • Chi-square test of independence
    • Odds ratio versus relative risk
    • Concepts behind the logistic regression model
    • Logistic regression with categorical variables
    • Logistic regression with interactions

Additional topics??